AITopics | english alphabet

Bypassing Prompt Guards in Production with Controlled-Release Prompting

Fairoze, Jaiden; Garg, Sanjam; Lee, Keewoo; Wang, Mingyuan

arXiv.org Artificial Intelligence, Oct-8-2025

As large language models (LLMs) advance, ensuring AI safety and alignment is paramount. One popular approach is prompt guards, lightweight mechanisms designed to filter malicious queries while being easy to implement and update. In this work, we introduce a new attack that circumvents such prompt guards, highlighting their limitations. Our method consistently jailbreaks production models while maintaining response quality, even under the highly protected chat interfaces of Google Gemini (2.5 Flash/Pro), DeepSeek Chat (DeepThink), Grok (3), and Mistral Le Chat (Magistral). The attack exploits a resource asymmetry between the prompt guard and the main LLM, encoding a jailbreak prompt that lightweight guards cannot decode but the main model can. This reveals an attack surface inherent to lightweight prompt guards in modern LLM architectures and underscores the need to shift defenses from blocking malicious inputs to preventing malicious outputs. We additionally identify other critical alignment issues, such as copyrighted data extraction, training data extraction, and malicious response leakage during thinking.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.01529

Country: North America > United States (0.92)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mitigating the Problem of Strong Priors in LMs with Context Extrapolation

Douglas, Raymond; Draguns, Andis; Gavenčiak, Tomáš

arXiv.org Artificial Intelligence, Jan-31-2024

Language models (LMs) have become important tools in a variety of applications, from data processing to the creation of instruction-following assistants. But despite their advantages, LMs have certain idiosyncratic limitations such as the problem of 'strong priors', where a model learns to output typical continuations in response to certain, usually local, portions of the input regardless of any earlier instructions. For example, prompt injection attacks can induce models to ignore explicit directives. In some cases, larger models have been shown to be more susceptible to these problems than similar smaller models, an example of the phenomenon of 'inverse scaling'. We develop a new technique for mitigating the problem of strong priors: we take the original set of instructions, produce a weakened version of the original prompt that is even more susceptible to the strong priors problem, and then extrapolate the continuation away from the weakened prompt. This lets us infer how the model would continue a hypothetical strengthened set of instructions. Our technique conceptualises LMs as mixture models which combine a family of data generation processes, reinforcing the desired elements of the mixture. Our approach works at inference time, removing any need for retraining. We apply it to eleven models including GPT-2, GPT-3, Llama 2, and Mistral on four tasks, and find improvements in 41/44. Across all 44 combinations the median increase in proportion of tasks completed is 40%.
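
The extrapolation step described in this abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below assumes a linear extrapolation in logit space; the function name `extrapolate_logits`, the parameter `alpha`, and the toy logit values are all hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def extrapolate_logits(original, weakened, alpha=1.0):
    """Push next-token logits away from the weakened-prompt prediction.

    Assumed form: original + alpha * (original - weakened).
    alpha = 0 reproduces the original logits; larger alpha extrapolates further.
    """
    if len(original) != len(weakened):
        raise ValueError("logit vectors must have the same length")
    return [o + alpha * (o - w) for o, w in zip(original, weakened)]

# Toy vocabulary of three tokens (made-up numbers):
# index 0 is the instructed answer, index 1 is the "strong prior" continuation.
original = [1.0, 1.5, 0.0]  # full instructions: the prior still narrowly wins
weakened = [0.0, 2.0, 0.0]  # weakened instructions: the prior wins clearly

extrapolated = extrapolate_logits(original, weakened, alpha=2.0)
probs = softmax(extrapolated)
best_token = max(range(len(probs)), key=probs.__getitem__)
```

In this toy case the token favoured by the instructions gains weight because it is the one the weakened prompt suppresses most. In practice the two logit vectors would come from two forward passes of the same model, one per prompt.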

completion, instruction, language model, (15 more...)

arXiv.org Artificial Intelligence

2401.17692

Country: Europe > Czechia > Prague (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An Analysis of Letter Dynamics in the English Alphabet

Zhao, Neil; Zheng, Diana

arXiv.org Artificial Intelligence, Jan-27-2024

The tabulation of commonly used letters, as determined by letter frequency, was later utilized to improve typewriter keyboard arrangement by minimizing hand motion [5]. Statistical characteristics of different letters of the English alphabet were further studied in the context of different sentence structures [6]. The letters 'B', 'S', 'M', 'H', 'C' were found to most frequently occur as the initial letters of proper nouns, while 'E', 'A', 'R', 'N' were the most frequently used letters when the entire proper noun is considered. For entire text documents, the most commonly used letters were found to be 'E', 'T', 'A', 'O', 'N'. Interestingly, 95% of the English vocabulary was found to be represented by 13 letters of the alphabet. Our manuscript expanded upon the statistical study of the English alphabet by evaluating letter frequency in the context of different categories of writings. We analyzed news articles, novels, plays, and scientific articles for letter frequency and distribution. As a result, we determined the information density of the letters of the alphabet. Additionally, we developed a metric called "distance, d" to act as a simple algorithm for recognizing writing category.
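
A letter-frequency analysis of the kind described here can be sketched in a few lines. This excerpt does not define "information density" or the "distance, d" metric, so the code below makes two labeled assumptions: Shannon entropy stands in for information density, and a Euclidean distance between frequency profiles stands in for d. The function names are hypothetical.

```python
from collections import Counter
import math
import string

def letter_frequencies(text):
    """Return the relative frequency of each letter a-z (case-insensitive)."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        return {ch: 0.0 for ch in string.ascii_lowercase}
    return {ch: counts[ch] / total for ch in string.ascii_lowercase}

def entropy_bits(freqs):
    """Shannon entropy in bits per letter (assumed proxy for information density)."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

def profile_distance(freqs_a, freqs_b):
    """Euclidean distance between two profiles (hypothetical stand-in for 'd')."""
    return math.sqrt(
        sum((freqs_a[ch] - freqs_b[ch]) ** 2 for ch in string.ascii_lowercase)
    )

sample = letter_frequencies("The quick brown fox jumps over the lazy dog.")
top_five = sorted(sample, key=sample.get, reverse=True)[:5]
```

Under these assumptions, a text could be assigned to the category whose reference profile is nearest by `profile_distance`; the paper's actual metric may differ.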

category, frequency, letter frequency, (16 more...)

arXiv.org Artificial Intelligence

2401.1556

Country:

North America > United States > Oregon (0.04)
South America > Brazil (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence (0.46)

Language Detection for Transliterated Content

S, Selva Kumar; Khan, Afifah Khan Mohammed Ajmal; Manjeshwar, Chirag; Banday, Imadh Ajaz

arXiv.org Artificial Intelligence, Jan-9-2024

In the contemporary digital era, the Internet functions as an unparalleled catalyst, dismantling geographical and linguistic barriers, as is particularly evident in texting. This evolution facilitates global communication, transcending physical distances and fostering dynamic cultural exchange. A notable trend is the widespread use of transliteration, where the English alphabet is employed to convey messages in native languages, posing a unique challenge for language technology in accurately detecting the source language. This paper addresses this challenge through a dataset of phone text messages in Hindi and Russian transliterated into English, utilizing BERT for language classification and the Google Translate API for transliteration conversion. The research pioneers innovative approaches to identify and convert transliterated text, navigating challenges in the diverse linguistic landscape of digital communication. Emphasizing the pivotal role of comprehensive datasets for training Large Language Models (LLMs) like BERT, our model showcases exceptional proficiency in accurately identifying and classifying languages from transliterated text. With a validation accuracy of 99%, our model's robust performance underscores its reliability. The comprehensive exploration of transliteration dynamics, supported by innovative approaches and cutting-edge technologies like BERT, positions our research at the forefront of addressing unique challenges in the linguistic landscape of digital communication. Beyond contributing to language identification and transliteration capabilities, this work holds promise for applications in content moderation, analytics, and fostering a globally connected community engaged in meaningful dialogue.

communication, english alphabet, transliteration, (14 more...)

arXiv.org Artificial Intelligence

2401.04619

Country: Asia > India > Karnataka > Bengaluru (0.07)

Genre:

Overview > Innovation (0.76)
Research Report > Promising Solution (0.56)

Industry: Telecommunications (0.71)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.36)

Urdu News Article Recommendation Model using Natural Language Processing Techniques

Abbas, Syed Zain; Rahman, Arif ur; Mughal, Abdul Basit; Haider, Syed Mujtaba

arXiv.org Artificial Intelligence, May-29-2022

There are several online newspapers in Urdu, but users find it difficult to locate the content they are looking for because most of these newspapers contain irrelevant data, and most users do not retrieve what they want. Our proposed framework will help to predict Urdu news matching users' interests and reduce users' search time for news. For this purpose, NLP techniques are used for pre-processing, and then TF-IDF with cosine similarity is used to find the most similar articles and recommend news based on user preferences. Moreover, the BERT language model is also used for similarity; similarity scores are higher with BERT than with TF-IDF, so the approach works better with the BERT language model and recommends news to users according to their interests. News is recommended when the similarity of the articles is above 60 percent.
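
The TF-IDF and cosine-similarity step with a 60 percent cutoff can be sketched as follows. This is a minimal illustration, not the authors' pipeline: whitespace tokenization, the smoothed IDF variant, and the names `tfidf_vectors`, `cosine_similarity`, and `recommend` are assumptions, and real Urdu text would need the pre-processing the abstract mentions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a list of documents."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(read_article, candidates, threshold=0.6):
    """Return candidates whose similarity to the read article exceeds threshold."""
    vectors = tfidf_vectors([read_article] + candidates)
    query = vectors[0]
    scored = [
        (cosine_similarity(query, vec), text)
        for vec, text in zip(vectors[1:], candidates)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored if score > threshold]

picks = recommend(
    "cricket match result today",
    ["cricket match result today", "weather rain forecast tomorrow"],
)
```

The default `threshold=0.6` mirrors the 60 percent cutoff stated in the abstract. Swapping the TF-IDF vectors for BERT embeddings would leave `cosine_similarity` and the thresholding unchanged.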

news article, recommendation, similarity, (14 more...)

arXiv.org Artificial Intelligence

2206.11862

Country:

Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Industry: Media > News (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Accelerography: Feasibility of Gesture Typing using Accelerometer

Chowdhury, Arindam Roy; Dalal, Abhinandan; Sen, Shubhajit

arXiv.org Machine Learning, Mar-29-2020

In this paper, we aim to look into the feasibility of constructing alphabets using gestures. The main idea is to construct gestures that are easy to remember, not cumbersome to reproduce, and easily identifiable. We construct gestures for the entire English alphabet and provide an algorithm to identify the gestures, even when they are constructed continuously. We tackle the problem statistically, taking into account the randomness in users' hand movement gestures, and achieve an average accuracy of 97.33% with the entire English alphabet.

algorithm, axis, cutoff, (16 more...)

arXiv.org Machine Learning

2003.1431

Country:

Asia > India > West Bengal > Kolkata (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Missouri > Jackson County > Kansas City (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
